Iterative-improvement-based declustering heuristics for multi-disk databases
Abstract
Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial K-way declustering by recursively bipartitioning the data set, then applies multi-way refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic data sets show that the proposed method provides a significant performance improvement compared with the state-of-the-art declustering strategy based on similarity-graph partitioning. © 2003 Elsevier Ltd. All rights reserved.
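To make the objective concrete, the following is a minimal sketch rather than the paper's implementation. It assumes each query is a set of data items with a relative frequency (one hyperedge per query) and that a query's parallel response time is the largest total item size it must read from any single disk. The names `aggregate_response_time` and `refine` are hypothetical. The refinement loop is a naive single-item move heuristic that recomputes the full cost for every trial move; it ignores disk capacity constraints and does not reproduce the gain models or the recursive-bipartitioning phase described in the abstract.

```python
from collections import defaultdict


def aggregate_response_time(queries, disk_of, size=None):
    """Sum over queries of frequency * (heaviest per-disk load of that query).

    queries: list of (frequency, iterable_of_items) pairs
    disk_of: dict mapping item -> disk index
    size:    optional dict mapping item -> size (assumed 1 per item if omitted)
    """
    total = 0
    for freq, items in queries:
        load = defaultdict(int)
        for item in items:
            load[disk_of[item]] += 1 if size is None else size[item]
        # The slowest disk determines the parallel response time of the query.
        total += freq * max(load.values(), default=0)
    return total


def refine(queries, disk_of, num_disks, size=None):
    """Naive iterative improvement: move one item at a time to the disk that
    lowers the aggregate cost, and repeat until no single move helps.
    Capacity constraints are deliberately ignored in this sketch."""
    disk_of = dict(disk_of)  # work on a copy
    best = aggregate_response_time(queries, disk_of, size)
    improved = True
    while improved:
        improved = False
        for item in sorted(disk_of):
            home = disk_of[item]
            for d in range(num_disks):
                if d == home:
                    continue
                disk_of[item] = d  # trial move
                cost = aggregate_response_time(queries, disk_of, size)
                if cost < best:
                    best, home, improved = cost, d, True
            disk_of[item] = home  # keep the best disk found for this item
    return disk_of, best
```

Recomputing the whole objective for each trial move is quadratic in practice and only suitable for illustration; an efficient scheme would update per-query loads incrementally.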
Similar resources
Declustering Objects for Visualization
In this paper we propose a new declustering method which is particularly suitable for image and cartographic databases used for visualization. Our declustering method is based on algebraic techniques using vectors. The algorithm which computes the disk assignment requires O(Kj log K) time where K is the number of parallel disks in the system. The resulting disk assignment maximizes the area tha...
Scalability Analysis of Declustering Methods for Cartesian Product Files
Efficient storage and retrieval of multi-attribute datasets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files over multiple disks to obtain high perfo...
Declustering Databases on Heterogeneous Disk Systems
Declustering is a well-known strategy to achieve maximum I/O parallelism in multi-disk systems. Many declustering methods have been proposed for symmetrical disk systems, i.e., multi-disk systems in which all disks have the same speed and capacity. This work deals with the problem of adapting such declustering methods to work in heterogeneous environments. In such environments there are many type...
Efficient Disk Allocation for Fast Similarity Searching
As databases increasingly integrate non-textual information it is becoming necessary to support efficient similarity searching in addition to range searching. Recently, declustering techniques have been proposed for improving the performance of similarity searches through parallel I/O. In this paper, we propose a new scheme which provides good declustering for similarity searching. In particular...
Concentric Hyperspaces and Disk Allocation for Fast Parallel Range Searching
Data partitioning and declustering have been extensively used in the past to parallelize I/O for range queries. Numerous declustering and disk allocation techniques have been proposed in the literature. However, most of these techniques were primarily designed for two-dimensional data and for balanced partitioning of the data space. As databases increasingly integrate multimedia information in ...
Journal: Inf. Syst.
Volume: 30
Publication date: 2005